19 results for Molecular biology

in Helda - Digital Repository of University of Helsinki


Relevance:

100.00%

Publisher:

Abstract:

Progressive myoclonus epilepsy of Unverricht-Lundborg type (EPM1) is an autosomal recessively inherited disorder characterized by age of onset at 6-15 years, stimulus-sensitive myoclonus, tonic-clonic epileptic seizures and a progressive course. Mutations in the cystatin B (CSTB) gene underlie EPM1. The most common mutation underlying EPM1 is a dodecamer repeat expansion in the promoter region of CSTB. In addition, nine other mutations have been identified. CSTB, a cysteine protease inhibitor, is a ubiquitously expressed inhibitor of cathepsins, but its physiological function is unknown. The purpose of this study was to investigate CSTB gene expression and CSTB protein function in normal and pathological conditions. The basal CSTB promoter was mapped and characterized using different promoter-luciferase gene constructs. The binding activity of transcription factors to one ARE half, five Sp1 and four AP1 sites in the CSTB promoter was demonstrated. The CSTB promoter activity was clearly decreased using a CSTB promoter with "premutation" repeat expansions and in individuals with similar expansions. The expression of CSTB mRNA and protein was markedly reduced in patient cells. The endogenous CSTB protein localized to the nucleus, cytoplasm and lysosomes, and in differentiated cells only to the cytoplasm. This suggests that the subcellular distribution of CSTB is dependent on the differentiation status of the cells. The proteins representing patient missense mutations failed to associate with lysosomes, implying the importance of the lysosomal association for the proper physiological function of CSTB. Several alternatively spliced CSTB isoforms were identified. Of these, CSTB2 was widely expressed at very low levels, whereas the other alternatively spliced forms seemed to have limited tissue expression. In patients, CSTB2 expression was reduced similarly to that of CSTB. The physiological relevance of CSTB alternative splicing remains unknown.
The mouse Cstb transcript was shown to be present in all embryonic stages and adult tissues examined. The expression was highest at embryonic day 7 and in thymus, as well as in postnatal brain in the cortex, caudate putamen, thalamus, hippocampus, and in the Purkinje cell layer of the cerebellum. Our data imply that CSTB expression is tightly temporally and spatially regulated. The data presented in my thesis lay the basis for further understanding of the role of CSTB in health and disease.

Relevance:

100.00%

Publisher:

Abstract:

Epilysin (MMP-28) is the most recently identified member of the matrix metalloproteinase (MMP) family of extracellular proteases. Together these enzymes are capable of degrading almost all components of the extracellular matrix (ECM) and are thus involved in important biological processes such as development, wound healing and immune functions, but also in pathological processes such as tumor invasion, metastasis and arthritis. MMPs do not act solely by degrading the ECM. They also regulate cell behavior by releasing growth factors and biologically active peptides from the ECM, by modulating cell surface receptors and adhesion molecules and by regulating the activity of many important mediators in inflammatory pathways. The aim of this study was to define the unique role of epilysin within the MMP family, to elucidate how and when it is expressed and how its catalytic activity is regulated. To gain information on its essential functions and substrates, the specific aim was to characterize how epilysin affects the phenotype of epithelial cells, where it is biologically expressed. During the course of the study we found that the epilysin promoter contains a well-conserved GT-box that is essential for the basic expression of this gene. Transcription factors Sp1 and Sp3 bind this sequence and could hence regulate both the basic and the cell type- and differentiation stage-specific expression of epilysin. We cloned mouse epilysin cDNA and found that epilysin is well conserved between human and mouse genomes and that epilysin is glycosylated and activated by furin. As in human tissues, epilysin is normally expressed in a number of mouse tissues. The expression pattern differs from most other MMPs, which are expressed only in response to injury or inflammation and in pathological processes like cancer.
These findings imply that epilysin could be involved in tissue homeostasis, perhaps fine-tuning the phenotype of epithelial cells according to signals from the ECM. In view of these results, it was unexpected to find that epilysin can induce a stable epithelial to mesenchymal transition (EMT) when overexpressed in epithelial lung carcinoma cells. Transforming growth factor β (TGF-β) was recognized as a crucial mediator of this process, which was characterized by the loss of E-cadherin-mediated cell-cell adhesion, elevated expression of gelatinase B and MT1-MMP and increased cell migration and invasion into collagen I gels. We also observed that epilysin is bound to the surface of epithelial cells and that this interaction is lost upon cell transformation and is susceptible to degradation by membrane type-1-MMP (MT1-MMP). The wide expression of epilysin under physiological conditions implies that its effects on epithelial cell phenotype in vivo are not as dramatic as seen in our in vitro cell system. Nevertheless, current results indicate a possible interaction between epilysin and TGF-β also under physiological circumstances, where epilysin activity may not induce EMT but, instead, trigger less permanent changes in TGF-β signaling and cell motility. Epilysin may thus play an important role in TGF-β regulated events such as wound healing and inflammation, processes where involvement of epilysin has been indicated.

Relevance:

100.00%

Publisher:

Abstract:

Scattering of X-rays and neutrons has been applied to the study of nanostructures with interesting biological functions. The systems studied were the protein calmodulin and its complexes, bacterial virus bacteriophage phi6, and the photosynthetic antenna complex from green sulfur bacteria, chlorosome. Information gathered using various structure determination methods has been combined with the low resolution information obtained from solution scattering. Conformational changes in calmodulin-ligand complex were studied by combining the directional information obtained from residual dipole couplings in nuclear magnetic resonance with the size information obtained from small-angle X-ray scattering from solution. The locations of non-structural protein components in a model of bacteriophage phi6, based mainly on electron microscopy, were determined by neutron scattering, deuterium labeling and contrast variation. New data are presented on the structure of the photosynthetic antenna complex of green sulfur bacteria and filamentous anoxygenic phototrophs, also known as the chlorosome. The X-ray scattering and electron cryomicroscopy results from this system are interpreted in the context of a new structural model detailed in the third paper of this dissertation. The model is found to be consistent with the results obtained from various chlorosome-containing bacteria. The effects of carotenoid synthesis on the chlorosome structure and self-assembly are studied by carotenoid extraction, biosynthesis inhibition and genetic manipulation of the enzymes involved in carotenoid biosynthesis. Carotenoid composition and content are found to have a marked effect on the structural parameters and morphology of chlorosomes.

Relevance:

70.00%

Publisher:

Abstract:

Studying neurodegeneration provides an opportunity to gain insights into normal cell physiology, and not just pathophysiology. In this thesis work the focus is on Infantile Neuronal Ceroid Lipofuscinosis (INCL). It is a recessively inherited lysosomal storage disorder. The disease belongs to the neuronal ceroid lipofuscinoses (NCLs), a group of common progressive neurodegenerative diseases of childhood. Characteristic accumulation of autofluorescent storage material is seen in most tissues, but only neurons of the central nervous system are damaged and eventually lost during the course of the disease, leaving most other cell types unaffected. The disease is caused by mutations in the CLN1 gene, but the physiological function of the corresponding protein, the palmitoyl protein thioesterase (PPT1), has remained elusive. The aim of this thesis work was to shed light on the molecular and cell biological mechanisms behind INCL. This study pinpointed the localization of PPT1 in axonal presynapses of neurons. It also established the role of PPT1 in early neuronal maturation as well as its importance in mature neuronal synapses. This study revealed an endocytic defect in INCL patient cells manifesting itself as delayed trafficking of receptor and non-receptor mediated endocytic markers. Furthermore, this study was the first to connect the INCL storage proteins, the sphingolipid activator proteins (SAPs) A and D, to pathological events on the cellular level. Abnormal endocytic processing and intracellular re-localization was demonstrated in patient cells and disease model knock-out mouse neurons. To identify early affected cellular and metabolic pathways in INCL, knock-out mouse neurons were studied by global transcript profiling and functional analysis. The gene expression analysis revealed changes in neuronal maturation and cell communication strongly associated with the regulated secretory system. Furthermore, cholesterol metabolic pathways were found to be affected.
Functional studies with the knock-out mouse model revealed abnormalities in neuronal maturation as well as key neuronal functions including abnormalities in intracellular calcium homeostasis and cholesterol metabolism. Together, the findings presented in this thesis work support the essential role of PPT1 in developing neurons as well as synaptic sites of mature neurons. Results of this thesis also elucidate early events in INCL pathogenesis, revealing defective pathways ultimately leading to the neurodegenerative process. These results contribute to the understanding of the vital physiological function of PPT1 and broader knowledge of common cellular mechanisms behind neurodegeneration. These results add to the knowledge of these severe diseases, offering a basis for new approaches in treatment strategies.

Relevance:

60.00%

Publisher:

Abstract:

The studies presented in this thesis aimed at a better understanding of the molecular biology of Sweet potato chlorotic stunt virus (SPCSV, Crinivirus, Closteroviridae) and its role in the development of synergistic viral diseases. The emphasis was on the severe sweet potato virus disease (SPVD) that results from a synergistic interaction of SPCSV and Sweet potato feathery mottle virus (SPFMV, Potyvirus, Potyviridae). SPVD is the most important disease affecting sweetpotato. It is manifested as a significant increase in symptom severity and SPFMV titres. This is accompanied by a dramatic sweetpotato yield reduction. SPCSV titres remain little affected in the diseased plants. Viral synergistic interactions have been associated with the suppression of an adaptive general defence mechanism discovered in plants and known as RNA silencing. In the studies of this thesis two novel proteins (RNase3 and p22) identified in the genome of a Ugandan SPCSV isolate were shown to be involved in suppression of RNA silencing. RNase3 displayed a dsRNA-specific endonuclease activity that enhanced the RNA-silencing suppression activity of p22. Comparative analyses of criniviral genomes revealed variability in the gene content at the 3′ end of the genomic RNA1. Molecular analyses of different isolates of SPCSV indicated a marked intraspecific heterogeneity in this region where the p22 and RNase3 genes are located. Isolates of the East African strain of SPCSV from Tanzania and Peru and an isolate from Israel were missing a 767-nt fragment that included the p22 gene. However, regardless of the absence of p22, all SPCSV isolates acted synergistically with SPFMV in co-infected sweetpotato, enhanced SPFMV titres and caused SPVD. These results showed that p22 is dispensable for development of SPVD. The role of RNase3 in SPVD was then studied by generating transgenic plants expressing the RNase3 protein.
These plants had increased titres of SPFMV (ca. 600-fold higher in comparison with nontransgenic plants) 2-3 weeks after graft inoculation and displayed the characteristic SPVD symptoms. RNA silencing suppression (RSS) activity of RNase3 was detected in agroinfiltrated leaves of Nicotiana benthamiana. In vitro studies showed that RNase3 was able to cleave small interfering RNAs (siRNA) to products of ~14-nt. The data thus identified RNase3 as a suppressor of RNA silencing able to cleave siRNAs. RNase3 expression alone was sufficient for breaking down resistance to SPFMV in sweetpotato and for the development of SPVD. Similar RNase III-like genes exist in animal viruses, which points to a novel and possibly more general mechanism of RSS by viruses. A reproducible method of sweetpotato transformation was used to target RNA silencing against the SPCSV polymerase region (RdRp) with an intron-spliced hairpin construct. Hence, engineered resistance to SPCSV was obtained. Ten out of 20 transgenic events challenged with SPCSV alone showed significantly reduced virus titres. This was, however, not sufficient to prevent SPVD upon coinfection with SPFMV. Immunity to SPCSV seems to be required to control SPVD, and targeting of different SPCSV regions needs to be assessed in further studies. Given the identified key role of RNase3 in SPVD, constructs that target this gene might prove more efficient in future studies.

Relevance:

60.00%

Publisher:

Abstract:

This thesis studies human gene expression space using high throughput gene expression data from DNA microarrays. In molecular biology, high throughput techniques allow numerical measurements of expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting the previously unusable and missing data and by improving the access to its data. It also contributed to creation of several new tools for microarray data manipulation and establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text mining and decision tree based method for automatic conversion of human readable free text microarray data annotations into categorised format. Data comparability and minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computation of a range of microarray data quality metrics and exclusion of incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering using heuristics and help from another purpose-built sample ontology.
A preface and motivation to the construction and analysis of a global map of human gene expression is given by analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.

Relevance:

60.00%

Publisher:

Abstract:

Transposons are mobile elements of genetic material that are able to move in the genomes of their host organisms using a special form of recombination called transposition. Bacteriophage Mu was the first transposon for which a cell-free in vitro transposition reaction was developed. Subsequently, the reaction has been refined and the minimal Mu in vitro reaction is useful in the generation of comprehensive libraries of mutant DNA molecules that can be used in a variety of applications. To date, the functional genetics applications of Mu in vitro technology have been limited to either plasmids or genomic regions and entire genomes of viruses cloned on specific vectors. This study expands the use of Mu in vitro transposition in functional genetics and genomics by describing novel methods applicable to the targeted transgenesis of the mouse and the whole-genome analysis of bacteriophages. The methods described here are rapid, efficient, and easily applicable to a wide variety of organisms, demonstrating the potential of the Mu transposition technology in the functional analysis of genes and genomes. First, an easy-to-use, rapid strategy to generate constructs for the targeted mutagenesis of mouse genes was developed. To test the strategy, a gene encoding a neuronal K+/Cl- cotransporter was mutagenised. After a highly efficient transpositional mutagenesis, the mutagenised gene fragments were cloned into a vector backbone and transferred into bacterial cells. These constructs were screened with PCR using an effective 3D matrix system. In addition to traditional knock-out constructs, the method developed yields hypomorphic alleles that lead to reduced expression of the target gene in transgenic mice and have since been used in a follow-up study. Moreover, a scheme is devised to rapidly produce conditional alleles from the constructs produced.
Next, an efficient strategy for the whole-genome analysis of bacteriophages was developed based on the transpositional mutagenesis of uncloned, infective virus genomes and their subsequent transfer into susceptible host cells. Mutant viruses able to produce viable progeny were collected and their transposon integration sites determined to map genomic regions nonessential to the viral life cycle. This method, applied here to three very different bacteriophages, PRD1, ΦYeO3-12, and PM2, does not require the target genome to be cloned and is directly applicable to all DNA and RNA viruses that have infective genomes. The method developed yielded valuable novel information on the three bacteriophages studied, and whole-genome data can be complemented with concomitant studies on individual genes. Moreover, end-modified transposons constructed for this study can be used to manipulate genomes devoid of suitable restriction sites.
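
The mapping step described above (integration sites from viable mutants marking nonessential regions) can be illustrated with a minimal Python sketch. It is not the analysis used in the thesis: the gene names, coordinates and insertion sites are hypothetical, and the function `genes_tolerating_insertions` is introduced here only for illustration. It assumes 1-based, inclusive gene coordinates on a linear genome.

```python
from typing import Dict, List, Tuple


def genes_tolerating_insertions(
    genes: Dict[str, Tuple[int, int]],
    insertion_sites: List[int],
) -> Dict[str, int]:
    """Count, per gene, the insertion sites recovered from viable mutants.

    `genes` maps a gene name to (start, end), 1-based and inclusive.
    `insertion_sites` holds transposon integration positions observed in
    mutant viruses that still produced viable progeny.
    """
    counts = {name: 0 for name in genes}
    for site in insertion_sites:
        for name, (start, end) in genes.items():
            if start <= site <= end:
                counts[name] += 1
    return counts


# Hypothetical toy genome: three adjacent genes and five viable insertions.
genes = {"geneA": (1, 300), "geneB": (301, 900), "geneC": (901, 1200)}
sites = [45, 120, 950, 1000, 1100]

counts = genes_tolerating_insertions(genes, sites)
# Genes hit at least once in viable mutants are candidates for being nonessential.
candidates = sorted(name for name, hits in counts.items() if hits > 0)
print(counts)
print(candidates)
```

A gene with no recovered insertions (geneB in this toy data) is only a candidate for being essential; in a small library the absence of hits may simply reflect sampling.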

Relevance:

60.00%

Publisher:

Abstract:

Transposons, mobile genetic elements that are ubiquitous in all living organisms, have been used as tools in molecular biology for decades. They have the ability to move into discrete DNA locations with no apparent homology to the target site. The utility of transposons as molecular tools is based on their ability to integrate into various DNA sequences efficiently, producing extensive mutant clone libraries that can be used in various molecular biology applications. Bacteriophage Mu is one of the most useful transposons due to its well-characterized and simple in vitro transposition reaction. This study establishes the properties of the Mu in vitro transposition system as a versatile multipurpose tool in molecular biology. In addition, this study describes Mu-based applications for engineering proteins by random insertional transposon mutagenesis in order to study structure-function relationships in proteins. We initially characterized the properties of the minimal Mu in vitro transposition system. We showed that the Mu transposition system works efficiently and accurately and produces insertions into a wide spectrum of target sites in different DNA molecules. Then, we developed a pentapeptide insertion mutagenesis strategy for inserting random five amino acid cassettes into proteins. These protein variants can be used especially for screening important sites for protein-protein interactions. Also, the system may produce temperature-sensitive variants of the protein of interest. Furthermore, we developed an efficient screening system for high-resolution mapping of protein-protein interfaces with the pentapeptide insertion mutagenesis. This was accomplished by combining the mutagenesis with subsequent yeast two-hybrid screening and PCR-based genetic footprinting.
This combination allows the analysis of the whole mutant library en masse, without the need for producing or isolating separate mutant clones, and the protein-protein interfaces can be determined at amino acid accuracy. The system was validated by analysing the interacting region of JFC1 with Rab8A, and we show that the interaction is mediated via the JFC1 Slp homology domain. In addition, we developed a procedure for the production of nested sets of N- and C-terminal deletion variants of proteins with the Mu system. These variants are useful in many functional studies of proteins, especially in mapping regions involved in protein-protein interactions. This methodology was validated by analysing the region in yeast Mso1 involved in an interaction with Sec1. The results of this study show that the Mu in vitro transposition system is versatile enough for a variety of applications and can efficiently be adapted to random protein engineering applications for functional studies of proteins.

Relevance:

60.00%

Publisher:

Abstract:

The type III secretion system (T3SS) is an essential requirement for the virulence of many Gram-negative bacteria which infect plants, animals and humans. Pathogens use the T3SS to deliver effector proteins from the bacterial cytoplasm to the eukaryotic host cells, where the effectors subvert host defenses. The best candidates for directing effector protein traffic are the bacterial type III-associated appendages, called needles or pili. In plant pathogenic bacteria, the best characterized example of a T3SS-associated appendage is the HrpA pilus of the plant pathogen Pseudomonas syringae pv. tomato DC3000. The components of the T3SS in plant pathogens are encoded by a cluster of hrp (hypersensitive reaction and pathogenicity) genes. Two major classes of T3SS-secreted proteins are harpin proteins such as HrpZ, which are exported into the extracellular space, and avirulence (Avr) proteins such as AvrPto, which are translocated directly to the plant cytoplasm. This study deals with the structural and functional characterization of the T3SS-associated HrpA pilus and the T3SS-secreted harpins. By insertional mutagenesis analysis of HrpA, we located the optimal epitope insertion site in the amino-terminus of HrpA, and revealed the potential application of the HrpA pilus as a carrier of antigenic determinants for vaccination. By pulse-expression of proteins combined with immuno-electron microscopy, we discovered the Hrp pilus assembly strategy as addition of HrpA subunits to the distal end of the growing pilus, and we showed for the first time that secretion of HrpZ occurs at the tip of the pilus. The pilus thus functions as a conduit delivering proteins to the extracellular milieu. By using phage-display and scanning-insertion mutagenesis methods we identified a conserved HrpZ-binding peptide and localized the peptide-binding site to the central domain of HrpZ. We also found that HrpZ specifically interacts with a host bean protein.
Taken together, the current results provide deeper insight into the molecular mechanism of T3SS-associated pilus assembly and effector protein translocation, which will be helpful for further studies on the pathogenic mechanisms of Gram-negative bacteria and for developing new strategies to prevent bacterial infection.

Relevance:

60.00%

Publisher:

Abstract:

Viruses are biological entities able to replicate only within their host cells. Accordingly, entry into the host is a crucial step of the virus life-cycle. The focus of this study was the entry of bacterial membrane-containing viruses into their host cells. In order to reach the site of replication, the cytoplasm of the host, bacterial viruses have to traverse the host cell envelope, which consists of several distinct layers. A lipid membrane is a common feature among animal viruses but is less frequently observed in bacteriophages. There are three families of icosahedral bacteriophages that contain lipid membranes. These viruses belong to the families Cystoviridae, Tectiviridae, and Corticoviridae. During the course of this study the entry mechanisms of phages representing the three viral families were investigated. We employed a range of microbiological, biochemical, molecular biology and microscopy techniques that allowed us to dissect phage entry into discrete steps: receptor binding, penetration through the outer membrane, crossing the peptidoglycan layer and interaction with the cytoplasmic membrane. We determined that bacteriophages belonging to the Cystoviridae, Tectiviridae, and Corticoviridae viral families use completely different strategies to penetrate into their host cells.

Relevance:

60.00%

Publisher:

Abstract:

Autoimmune diseases are a major health problem. Usually autoimmune disorders are multifactorial and their pathogenesis involves a combination of predisposing variations in the genome and other factors such as environmental triggers. APECED (autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy) is a rare, recessively inherited, autoimmune disease caused by mutations in a single gene. Patients with APECED suffer from several organ-specific autoimmune disorders, often affecting the endocrine glands. The defective gene, AIRE, codes for a transcriptional regulator. The AIRE (autoimmune regulator) protein controls the expression of hundreds of genes, representing a substantial subset of tissue-specific antigens which are presented to developing T cells in the thymus, and has proven to be a key molecule in the establishment of immunological tolerance. However, the molecular mechanisms by which AIRE mediates its functions are still largely obscure. The aim of this thesis has been to elucidate the functions of AIRE by studying the molecular interactions it is involved in by utilizing different cultured cell models. A potential molecular mechanism for exceptional, dominant inheritance of APECED in one family, carrying a glycine 228 to tryptophan (G228W) mutation, was described in this thesis. It was shown that the AIRE polypeptide with the G228W mutation has a dominant negative effect by binding the wild type AIRE and inhibiting its transactivation capacity in vitro. The data also emphasize the importance of homomultimerization of AIRE in vivo. Furthermore, two novel protein families interacting with AIRE were identified. The importin alpha molecules regulate the nuclear import of AIRE by binding to the nuclear localization signal of AIRE, delineated as a classical monopartite signal sequence. The interaction of AIRE with PIAS E3 SUMO ligases indicates a link to the sumoylation pathway, which plays an important role in the regulation of nuclear architecture.
It was shown that AIRE is not a target for SUMO modification but enhances the localization of SUMO1 and PIAS1 proteins to nuclear bodies. Additional support for the suggestion that AIRE would preferably up-regulate genes with a tissue-specific expression pattern and down-regulate housekeeping genes was obtained from transactivation studies performed with two models: human insulin and cystatin B promoters. Furthermore, AIRE and PIAS activate the insulin promoter concurrently in a transactivation assay, indicating that their interaction is biologically relevant. Identification of novel interaction partners for AIRE provides information about the molecular pathways involved in the establishment of immunological tolerance and deepens our understanding of the role played by AIRE not only in APECED but possibly also in several other autoimmune diseases.

Relevance:

60.00%

Publisher:

Abstract:

Filamentous fungi of the subphylum Pezizomycotina are well known as protein and secondary metabolite producers. Various industries take advantage of these capabilities. However, the molecular biology of yeasts, i.e. Saccharomycotina and especially that of Saccharomyces cerevisiae, the baker's yeast, is much better known. In an effort to explain fungal phenotypes through their genotypes we have compared protein-coding gene contents of Pezizomycotina and Saccharomycotina. Only biomass degradation and secondary metabolism related protein families seem to have expanded recently in Pezizomycotina. Of the protein families clearly diverged between Pezizomycotina and Saccharomycotina, those related to mitochondrial functions emerge as the most prominent. However, the primary metabolism as described in S. cerevisiae is largely conserved in all fungi. Apart from the known secondary metabolism, Pezizomycotina have pathways that could link secondary metabolism to primary metabolism and a wealth of undescribed enzymes. Previous studies of individual Pezizomycotina genomes have shown that regardless of the difference in production efficiency and diversity of secreted proteins, the content of the known secretion machinery genes in Pezizomycotina and Saccharomycotina appears very similar. Genome-wide analysis of gene products is therefore needed to better understand the efficient secretion of Pezizomycotina. We have developed methods applicable to transcriptome analysis of non-sequenced organisms. TRAC (Transcriptional profiling with the aid of affinity capture) has been previously developed at VTT for fast, focused transcription analysis. We introduce a version of TRAC that allows more powerful signal amplification and multiplexing. We also present computational optimisations of transcriptome analysis of non-sequenced organisms and TRAC analysis in general. Trichoderma reesei is one of the most commonly used Pezizomycotina in the protein production industry.
In order to understand its secretion system better and find clues for improvement of its industrial performance, we have analysed its transcriptomic response to protein secretion stress conditions. In comparison to S. cerevisiae, the response of T. reesei appears different, but still impacts on the same cellular functions. We also discovered in T. reesei interesting similarities to mammalian protein secretion stress response. Together these findings highlight targets for more detailed studies.

Relevance:

60.00%

Publisher:

Abstract:

A repetitive sequence collection is one where portions of a base sequence of length n are repeated many times with small variations, forming a collection of total length N. Examples of such collections are version control data and genome sequences of individuals, where the differences can be expressed by lists of basic edit operations. Flexible and efficient data analysis on such a typically huge collection is plausible using suffix trees. However, a suffix tree occupies O(N log N) bits, which very soon inhibits in-memory analyses. Recent advances in full-text self-indexing reduce the space of the suffix tree to O(N log σ) bits, where σ is the alphabet size. In practice, the space reduction is more than 10-fold, for example for the suffix tree of the human genome. However, this reduction factor remains constant when more sequences are added to the collection. We develop a new family of self-indexes suited for the repetitive sequence collection setting. Their expected space requirement depends only on the length n of the base sequence and the number s of variations in its repeated copies. That is, the space reduction factor is no longer constant, but depends on N / n. We believe the structures developed in this work will provide a fundamental basis for storage and retrieval of individual genomes as they become available due to rapid progress in the sequencing technologies.
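
The quantities n, s, N and σ in the abstract above can be made concrete with a small Python sketch. It is not the self-index developed in the thesis; it only stores a toy collection as a base sequence plus per-copy edit lists and compares sizes. The names `Substitution`, `apply_edits`, `collection_sizes` and `pointer_to_symbol_bit_ratio` are hypothetical. Two simplifying assumptions are made: edits are restricted to substitutions, so every copy has length n, and N counts the base sequence together with its copies.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Substitution:
    """One basic edit operation: replace the symbol at a 0-based position."""
    position: int
    symbol: str


def apply_edits(base: str, edits: List[Substitution]) -> str:
    """Materialise one copy of the base sequence with its variations applied."""
    seq = list(base)
    for edit in edits:
        seq[edit.position] = edit.symbol
    return "".join(seq)


def collection_sizes(base: str, variants: List[List[Substitution]]) -> Tuple[int, int, int]:
    """Return (n, s, N): base length, total number of variations, and total
    length of the fully expanded collection (base plus all copies)."""
    n = len(base)
    s = sum(len(edits) for edits in variants)
    total = n * (1 + len(variants))
    return n, s, total


def pointer_to_symbol_bit_ratio(total_length: int, alphabet_size: int) -> float:
    """Ratio log2(N) / log2(sigma): bits per position for a pointer into the
    text versus bits per symbol. Constant factors hidden by the O-notation
    are ignored, so this is only an order-of-magnitude guide."""
    return math.log2(total_length) / math.log2(alphabet_size)


base = "ACGTACGT"
variants = [
    [Substitution(0, "T")],
    [Substitution(3, "A"), Substitution(7, "C")],
]

print(apply_edits(base, variants[0]))        # TCGTACGT
print(collection_sizes(base, variants))      # (8, 3, 24)
print(pointer_to_symbol_bit_ratio(2**32, 4)) # 16.0
```

With a DNA alphabet (σ = 4) and a text of a few billion symbols, the ratio is about 16, which is in line with the more than 10-fold reduction quoted in the abstract. The toy collection also shows why a representation whose size depends on n and s rather than on N gains more as further near-identical copies are added.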